Genotype F of Echovirus 25 with multiple recombination pattern have been persistently and extensively circulating in Chinese mainland

Echovirus 25 (E25), a member of the Enterovirus B (EV-B) species, can cause aseptic meningitis (AM), viral meningitis (VM), and acute flaccid paralysis (AFP). However, systematic studies on the molecular epidemiology of E25, especially those concerning its evolution and recombination, are lacking. In this study, 18 strains of E25, isolated from seven provinces of China between 2009 and 2018, were collected based on the Chinese hand, foot, and mouth disease (HFMD) surveillance network, and 95 sequences downloaded from GenBank were also screened. Based on the phylogenetic analysis of 113 full-length VP1 sequences worldwide, globally occurring E25 strains were classified into 9 genotypes (A–I), and genotype F was the dominant genotype in the Chinese mainland. The average nucleotide substitution rate of E25 was 6.08 × 10–3 substitutions/site/year, and six important transmission routes were identified worldwide. Seventeen recombination patterns were determined, of which genotype F can be divided into 9 recombination patterns. A positive selector site was found in the capsid protein region of genotype F. Recombination analysis and pressure selection analysis for genotype F showed multiple recombination patterns and evolution characteristics, which may be responsible for it being the dominant genotype in the Chinese mainland. This study provides a theoretical basis for the subsequent prevention and control of E25.

1957 in the USA 5 , and since the discovery of the prototype strain, E25 has been detected in several countries, including India 6 , China 7 , Australia 8 , the USA 9 , the United Kingdom 10 , France 11 , Italy 12 , Germany 13 , and Korea 14 .Although E25 has been detected in some national disease surveillance systems, sequences of E25 in GenBank are still relatively scarce, Moreover, there is currently no well-accepted classification system available for E25.In 2010, Chao Ling et al. classified two genotypes of E25 (genotypes A and B), based on nucleotide differences greater than 20%, with four E25 strains from Henan Province, China, and 13 E25 strains from GenBank 15 .In 2015, Li Hongjie et al. defined four lineages (A to D) based on one strain from Beijing, China, and 19 strains from GenBank 4 .Due to the small number of sequences included, the accuracy of the above study needs to be further investigated.
To date, studies on the genetic evolution, phylogenetic relationships, and spatiotemporal dynamics of E25 are still relatively scarce.In this study, we combined the 18 E25 isolates collected from the Chinese mainland with the full length of all E25 VP1 sequences retrieved from GenBank to construct a global gene sequence database of E25, thus providing a more systematic molecular epidemiological analysis of E25 on a global scale.

Nine genotypes were reclassified based on the VP1 sequences
According to the guidelines for genotype classification, 15-25% nucleotide differences between genotypes and less than 15% within genotypes 16 , 113 full-length VP1 sequences of E25 could be classified into nine genotypes (A-I) (Fig. 1).The nucleotide differences among the 9 genotypes ranged from 15.7 to 24.7%, and details about nucleotide and amino acid differences between the different genotypes are shown in Table 1.

Six important global geographic transmission paths of E25
Based on the above 110 full-length VP1 sequences of E25, a sequence database that includes six countries (the USA, China, Australia, France, Germany, and India) was established to analyze the global spatiotemporal dynamics of E25.Based on the Bayes factor (BF) ≥ 10 and posterior probability (PP) ≥ 0.5, six significant transmission pathways were identified: China to Germany (BF = 23.88,PP = 0.85);China to France (BF = 37.23, PP = 0.90);China to Australia (BF = 21.56,PP = 0.83); India to the USA (BF = 12.24, PP = 0.74); India to Australia (BF = 400.25,PP = 0.99); and France to the USA (BF = 12.17, PP = 0.74) (Fig. 3A, Supplementary Table S3).The above pathways show that E25 primarily spreads from Asia to the rest of the world.In addition, the results of the Markov reward showed that China dominates the output of E25 worldwide with a Markov reward value of 11.32, which is much higher than other countries (Fig. 3B, Supplementary Table S4).However, the results are somewhat biased due to the limited the number of available world series for each country.

Analysis of recombination patterns of E25
For a better understanding of the recombinant pattern of E25, a total of 37 sequences containing 7 genotypes (A, D-I) were used to construct phylogenetic trees based on the P1, P2, and P3 regions with the prototypes of other serotypes in the EVB group respectively.Among them, 18 sequences were obtained from this study and 19 were downloaded from GenBank (Fig. 4A-C).The results showed that all 37 strains of E25 clustered together in the P1 region and branched in the P2 and P3 regions.According to the position of sequences with varying genotypes on the phylogenetic tree in the P2 and P3 regions, we define that, if the nucleotide differences among the different linages were more than 15% it could be recognized as a valid recombination lineage.We defined 18 lineages (including one prototype lineage) to facilitate the analysis of their recombination patterns (Fig. 4A-C, Supplementary Table S5), and Genotype F could be divided into nine lineages (lineage F1-F9), while Genotype E had one lineage (lineage E).Analysis using Simplot software revealed significant differences in nucleotide similarity between Genotype E and Genotype F, with the nine lineages of Genotype F showing large differences in the P2 and P3 regions (Fig. 4D).This is also consistent with the results of the phylogenetic trees constructed by P2 and P3 and also shows that the recombination pattern differs between different lineages.According to the differences among these lineages, 17 recombination patterns were identified, of which Genotype F can be divided into 9 recombination patterns.For further validation, we randomly selected one strain from each of these 17 lineages as the reference sequence and used RDP4 software for recombination analysis, which showed that the breakpoint position information of each of the 17 reference sequences was different, and in addition, the serotypes of the prevalent strains that recombined with the seventeen reference strains also differed significantly (Fig. 4E, Supplementary Table S6).This also confirms that the reorganization patterns of the 17 lineages classified are indeed different.

A positive selection site was detected in genotype F
Since the P1 region sequences of other genotypes were rare or absent (Supplementary Table S6), we selected Genotype F, which had a relatively large number of sequences, for analysis.By analyzing the selection pressure of global Genotype F globally, we found that the average ratio of nonsynonymous to synonymous amino acid substitutions (dN/dS) in the P1 region was 0.207.There was a positive selection site at amino acid position 274 in the VP1 region, which was identified by both the MEME and SCLA models.Moreover, the amino acids at this site also differed among the reference strains on the 9 different lineages of Genotype F (Fig. 5).

Discussion
As one of the earliest enteroviruses discovered in the world, E25 is still identified by enterovirus surveillance systems in some countries, and there is a risk of it causing outbreaks in the population.In contrast to its risk, systematic research on the molecular epidemiology of E25 is lacking, especially for those related to evolution and recombination.Therefore, we performed a global-wide analysis using 113 complete VP1 sequences of E25 www.nature.com/scientificreports/from 1957 to 2018, together with 18 whole-genome sequences obtained in this study to provide a comprehensive molecular characterization and recombination analysis of E25.
Genotyping analysis can provide a theoretical basis for studying genetic characteristics and molecular evolution principles.In this study, we classified the global E25 full-length VP1sequences into nine genotypes.Compared with previous studies 4,15 , we established a much larger dataset including more sequences with more countries and disease types.In addition, according to published genotyping criteria of other thoroughly studied enteroviruses, our analysis of the genotypes among different regions by more rigorous classification criteria makes the results more reliable.Genotype F is the predominant genotype in the Chinese mainland and has been spreading in the Chinese mainland for 20 years.We also found that different evolutionary branches formed in the ML tree, which indicates that Genotype F continued to mutate and evolve during these years, and has the potential risk of causing outbreaks in the future.
BEAST analysis showed that the evolutionary rate of E25 is close to the evolutionary rate of some other enteroviruses [17][18][19] .Moreover, the evolutionary rate of genotype F was approximately the same as the total average rate of E25, which indicates that other genotypes of E25 have not undergone large-scale mutation and evolution, although the rate and ancestor cannot be estimated due to the small size of sequences in this study.From the community dynamic distribution map, we found that the E25 community size experienced a small change from 2004 to 2008, but did not cause large-scale disease outbreaks based on a literature search.This partly reflects the fact that E25 outbreaks and epidemics are easily overlooked and that the disease burden may be underestimated.Therefore, it is particularly important to strengthen the surveillance and monitoring of enteroviruses, especially the screening and detection of other enteroviruses, in the external environment and the population.Six important dissemination paths of E25 were found worldwide, and three of them involved export from China; they were strongly supported by the BF value and were consistent with the Markov reward results.Although there is a certain selection bias due to the large number of Chinese sequences, this result also reflects that with the frequent development of world trade, the cross-transmission of infectious diseases, especially the transmission of different viruses or different genotypes of viruses, cannot be ignored.
In addition, a positive selector site was found in the capsid protein region of genotype F by selective pressure analysis in E25 capsid protein.The VP1 structural protein contains some important antigenic sites and changes in amino acids in this region may affect viral virulence 20,21 , we speculated that changes in this site may increase the ability of the virus to invade the host.Therefore, the pathogenicity and transmission of the Genotype F virus in the population are improved, but further animal experiments are needed to verify this.
Although we have collected the sequences of E25 in GenBank as much as possible, after our screening and verification, we found that there are few effective sequences that can be used.Among these, sequences that collected from China have the most and complete information, which leads to a certain bias in sequence selection.Similarly, as most of the complete genome sequences in GenBank (10/19) lack important information including disease type and isolation origins, the recombination analysis in this study only compared the different recombination patterns among different genotypes.Meanwhile, although we attempted to conduct a geographic spreads analysis covering more countries by basing on partial VP1 sequences, we found that this would sacrifice more important nucleotide sites (requiring a reduction of more than 500 bp).Therefore, this study only provides a systematic bioinformatics analysis based on full-length of VP1, which also causes a certain limitation.
In summary, we identified E25 genotyping criteria and successfully classified 9 genotypes according to these criteria in this study.We analyzed the evolutionary origin and spatial transmission of E25 worldwide, and different genotypes showed different epidemic characteristics.Recombination analysis and pressure selection analysis of genotype F showed that there are more multiple recombination patterns in the P2 and P3 regions and a positive selector site in the capsid protein region.These factors may be responsible for Genotype F being

Figure 1 .
Figure 1.The maximum likelihood phylogenetic tree is based on the entire VP1 genome of 113 E25 genome sequences.Filled circle: •E25 strains in this study.Filled rhombus: ◆E25 prototype strain.

Figure 2 .
Figure 2. (A)The maximum clade credibility (MCC) phylogenetic tree was generated using the Markov chain Monte Carlo (MCMC) method based on 110 complete VP1 sequences of E25.The color of the branches represents the isolates of different genotypes.(B) Bayesian skyline plot of the whole E25 VP1 region sequence, reflecting the relative genetic diversity of E25 from 1957 to 2018.The X-axis is the time scale (year), and the Y-axis is the effective population size; the solid line is the estimated median, and the blue shadow is the 95% highest posterior density.(C) Bayesian skyline plot of theVP1 region sequence of E25 Genotype F.

Figure 3 .
Figure 3. (A) The global spatial transmission route of E25.(B) The histogram of the average number of state transitions is based on the geographical location of six countries.

Figure 4 .
Figure 4. NJ trees based on P1, P2, and P3 regions of the prototype sequence of all Enteroviruses B in the GenBank database with 37 E25 strains.Filled rhombus: indicates E25 prototype strain (JV-4), filled circle: E25 strains in this study.Numbers on codes indicate the bootstrap support of the node (1000 bootstrap replicate percentage).(A) P1 coding sequences; (B) P2 coding sequences; (C) P3 coding sequences.(D) Recombination breakpoints based on the whole genomes of different lineages as detected within genotypes.(E) Genomic mapping of representative strain recombination events predicted using RDP4 software for E25.

Figure 5 .
Figure 5. Positive amino acid selector sites in the VP1 region of E25.Positive selector sites are located at the blue square markers.

Table 1 .
Information on 9 genotypes of E25 relying on full-length VP1 sequences.